 underrepresented class




Long-Tailed Data Classification by Increasing and Decreasing Neurons During Training

Sakai, Taigo, Hotta, Kazuhiro

arXiv.org Artificial Intelligence

In conventional deep learning, the number of neurons typically remains fixed during training. However, insights from biology suggest that the human hippocampus undergoes continuous generation and pruning of neurons over the course of learning, implying that a flexible allocation of capacity can enhance performance. Real-world datasets often exhibit class imbalance, in which certain classes have far fewer samples than others, leading to significantly reduced recognition accuracy for minority classes when relying on fixed-size networks. To address this challenge, we propose a method that periodically adds and removes neurons during training, thereby boosting representational power for minority classes. By retaining critical features learned from majority classes while selectively increasing neurons for underrepresented classes, our approach dynamically adjusts capacity during training. Importantly, while the number of neurons changes throughout training, the final network size and structure remain unchanged, ensuring efficiency and compatibility with deployment. Furthermore, through experiments on three different datasets and five representative models, we demonstrate that the proposed method outperforms fixed-size networks and shows even greater accuracy when combined with other imbalance-handling techniques. Our results underscore the effectiveness of dynamic, biologically inspired network designs in improving performance on class-imbalanced data.


T2ID-CAS: Diffusion Model and Class Aware Sampling to Mitigate Class Imbalance in Neck Ultrasound Anatomical Landmark Detection

Varaganti, Manikanta, Vankayalapati, Amulya, Awad, Nour, Dion, Gregory R., Brattain, Laura J.

arXiv.org Artificial Intelligence

Author affiliations: Manikanta Varaganti (1), Amulya Vankayalapati (2), Nour Awad (2), Gregory R. Dion (2), and Laura J. Brattain (1, 3). (1) Department of Computer Science, University of Central Florida, Orlando, FL, USA; (2) Department of Otolaryngology Head Neck Surgery, University of Cincinnati College of Medicine, OH, USA; (3) Department of Internal Medicine, University of Central Florida College of Medicine, Orlando, FL, USA.

Abstract: Neck ultrasound (US) plays a vital role in airway management by providing non-invasive, real-time imaging that enables rapid and precise interventions. Deep learning-based anatomical landmark detection in neck US can further facilitate procedural efficiency. However, class imbalance within datasets, where key structures like tracheal rings and vocal folds are underrepresented, presents significant challenges for object detection models. To address this, we propose T2ID-CAS, a hybrid approach that combines a text-to-image latent diffusion model with class-aware sampling to generate high-quality synthetic samples for underrepresented classes. This approach, rarely explored in the ultrasound domain, improves the representation of minority classes. Experimental results using YOLOv9 for anatomical landmark detection in neck US demonstrated that T2ID-CAS achieved a mean Average Precision of 88.2, significantly surpassing the baseline of 66.
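To make the idea of class-aware sampling concrete, the following is a minimal sketch of the general technique: pick a class uniformly at random, then pick a sample from that class, so that rare classes are drawn as often as common ones. It is not the authors' implementation. The function name `class_aware_sample`, the label names, and the simplification to one label per image are all assumptions made for illustration; how T2ID-CAS couples the sampler with the diffusion model is not described in the abstract.

```python
import random
from collections import defaultdict


def class_aware_sample(labels, num_draws, seed=0):
    """Draw sample indices so that every class is equally likely.

    Assumption: each sample carries exactly one label. Detection
    datasets with several classes per image need a richer scheme.
    """
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    classes = sorted(by_class)

    draws = []
    for _ in range(num_draws):
        cls = rng.choice(classes)                 # step 1: uniform over classes
        draws.append(rng.choice(by_class[cls]))   # step 2: uniform within class
    return draws


# Hypothetical labels: 95 frames of a common structure, 5 of a rare one.
labels = ["common_structure"] * 95 + ["vocal_fold"] * 5
draws = class_aware_sample(labels, num_draws=2000)
minority_share = sum(labels[i] == "vocal_fold" for i in draws) / len(draws)
print(f"minority share in sampled batch stream: {minority_share:.2f}")
```

With two classes, the rare class makes up roughly half of the draws even though it is only 5% of the data. The same few rare samples are repeated often, which is one motivation for adding synthetic samples to the minority class.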


Diabetic Retinopathy Detection Using CNN with Residual Block with DCGAN

Aronno, Debjany Ghosh, Saeha, Sumaiya

arXiv.org Artificial Intelligence

Diabetic Retinopathy (DR) is a major cause of blindness worldwide, caused by damage to the blood vessels in the retina due to diabetes. Early detection and classification of DR are crucial for timely intervention and preventing vision loss. This work proposes an automated system for DR detection using Convolutional Neural Networks (CNNs) with a residual block architecture, which enhances feature extraction and model performance. To further improve the model's robustness, we incorporate advanced data augmentation techniques, specifically leveraging a Deep Convolutional Generative Adversarial Network (DCGAN) for generating diverse retinal images. This approach increases the variability of training data, making the model more generalizable and capable of handling real-world variations in retinal images. The system is designed to classify retinal images into five distinct categories, from No DR to Proliferative DR, providing an efficient and scalable solution for early diagnosis and monitoring of DR progression. The proposed model aims to support healthcare professionals in large-scale DR screening, especially in resource-constrained settings.


Image Synthesis with Class-Aware Semantic Diffusion Models for Surgical Scene Segmentation

Zhou, Yihang, Towning, Rebecca, Awad, Zaid, Giannarou, Stamatia

arXiv.org Artificial Intelligence

Surgical scene segmentation is essential for enhancing surgical precision, yet it is frequently compromised by the scarcity and imbalance of available data. To address these challenges, semantic image synthesis methods based on generative adversarial networks and diffusion models have been developed. However, these models often yield non-diverse images and fail to capture small, critical tissue classes, limiting their effectiveness. In response, we propose the Class-Aware Semantic Diffusion Model (CASDM), a novel approach which utilizes segmentation maps as conditions for image synthesis to tackle data scarcity and imbalance. Novel class-aware mean squared error and class-aware self-perceptual loss functions have been defined to prioritize critical, less visible classes, thereby enhancing image quality and relevance. Furthermore, to our knowledge, we are the first to generate multi-class segmentation maps using text prompts in a novel fashion to specify their contents. These maps are then used by CASDM to generate surgical scene images, enhancing datasets for training and validating segmentation models. Our evaluation, which assesses both image quality and downstream segmentation performance, demonstrates the strong effectiveness and generalisability of CASDM in producing realistic image-map pairs, significantly advancing surgical scene segmentation across diverse and challenging datasets.


Dialog speech sentiment classification for imbalanced datasets

Nicolaou, Sergis, Mavrides, Lambros, Tryfou, Georgina, Tolias, Kyriakos, Panousis, Konstantinos, Chatzis, Sotirios, Theodoridis, Sergios

arXiv.org Artificial Intelligence

Speech is the most common way humans express their feelings, and sentiment analysis is the use of tools such as natural language processing and computational algorithms to identify the polarity of these feelings. Even though this field has seen tremendous advancements in the last two decades, effectively detecting underrepresented sentiments in different kinds of datasets remains challenging. In this paper, we use single- and bi-modal analysis of short dialog utterances and gain insights into the main factors that aid sentiment detection, particularly for the underrepresented classes, in datasets with and without an inherent sentiment component. Furthermore, we propose an architecture that uses a learning rate scheduler and different monitoring criteria and provides state-of-the-art results for the SWITCHBOARD imbalanced sentiment dataset.


Dealing with Imbalanced Data in TensorFlow: Class Weights

#artificialintelligence

Class imbalance is common when developing models for real-world applications. It occurs when there are substantially more instances associated with one class than with the other. For example, in a Credit Risk Modeling project, when looking at the status of loans in historical data, most of the loans granted have probably been paid in full. If models susceptible to class imbalance are used, defaulted loans would probably have little influence on the training process, as the overall loss continues to decrease when the model focuses on the majority class. To make the model pay more attention to examples where the loan was defaulted, class weights can be used so that the prediction error is larger when an instance of the underrepresented class is incorrectly classified.
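As an illustration of the class-weight idea, the sketch below computes inverse-frequency weights, n_samples / (n_classes * class_count), which is the same heuristic scikit-learn calls "balanced". The 90/10 loan split and the helper name `balanced_class_weights` are hypothetical and not taken from the article. The resulting dictionary has the shape that the `class_weight` argument of Keras `Model.fit` accepts; the call itself is shown only as a comment, since no model is defined here.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: n_samples / (n_classes * count_of_class)}."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical loan history: 0 = paid in full, 1 = defaulted.
labels = [0] * 90 + [1] * 10
class_weight = balanced_class_weights(labels)
print(class_weight)

# With a compiled Keras model (not defined in this sketch), the weights
# would be passed as:
#   model.fit(x_train, y_train, class_weight=class_weight)
```

Here each defaulted loan is weighted 5.0 and each repaid loan about 0.56, so a misclassified default contributes roughly nine times as much to the loss as a misclassified repaid loan.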


Common Practices -- Part 3

#artificialintelligence

These are the lecture notes for FAU's YouTube Lecture "Deep Learning". This is a full transcript of the lecture video and matching slides. We hope you enjoy this as much as the videos. Of course, this transcript was created largely automatically with deep learning techniques, and only minor manual modifications were performed. If you spot mistakes, please let us know!


Bayesian active learning for production, a systematic study and a reusable library

Atighehchian, Parmida, Branchaud-Charron, Frédéric, Lacoste, Alexandre

arXiv.org Machine Learning

Active learning can reduce labelling effort by using a machine learning model to query the user for specific inputs. While there are many papers on new active learning techniques, these techniques rarely satisfy the constraints of a real-world project. In this paper, we analyse the main drawbacks of current active learning techniques and present approaches to alleviate them. We conduct a systematic study of the effects of the most common issues of real-world datasets on the deep active learning process: model convergence, annotation error, and dataset imbalance. We derive two techniques that can speed up the active learning loop: partial uncertainty sampling and a larger query size. Finally, we present our open-source Bayesian active learning library, BaaL.
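The two speed-ups named in the abstract can be sketched in a generic way. The code below assumes that "partial uncertainty sampling" means scoring only a random subset of the unlabelled pool, and that "larger query size" means labelling more items per loop iteration; both readings are assumptions based on the abstract. The function `select_queries` and its parameters are hypothetical and are not part of the BaaL API. Uncertainty is measured here with predictive entropy, one common choice.

```python
import math
import random


def entropy(probs):
    """Predictive entropy of one categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_queries(pool_probs, query_size, subset_size=None, seed=0):
    """Pick the `query_size` most uncertain pool items.

    If `subset_size` is given, only a random subset of the pool is
    scored, which trades some selection quality for speed.
    """
    rng = random.Random(seed)
    indices = list(range(len(pool_probs)))
    if subset_size is not None and subset_size < len(indices):
        indices = rng.sample(indices, subset_size)
    # Highest entropy first; ties keep their current order.
    ranked = sorted(indices, key=lambda i: entropy(pool_probs[i]), reverse=True)
    return ranked[:query_size]


# Hypothetical model outputs for four unlabelled items (binary task).
pool_probs = [[0.5, 0.5], [0.9, 0.1], [0.99, 0.01], [0.6, 0.4]]
full = select_queries(pool_probs, query_size=2)
partial = select_queries(pool_probs, query_size=1, subset_size=2)
print(full, partial)
```

Scoring the full pool selects items 0 and 3, the two predictions closest to uniform. Scoring a subset costs fewer forward passes per loop, and a larger `query_size` means fewer retraining rounds for the same labelling budget.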